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ABSTRACT 

This  paper  develops  a  new  modified  Bayesian  Kriging 
(MBKG)  surrogate  modeling  method  for  problems  in  which 
simulation  analyses  are  inherently  noisy  and  thus  standard 
Kriging  approaches  fail  to  properly  represent  the  responses.  The 
purpose  is  to  develop  a  method  that  can  be  used  to  carry  out 
reliability  analysis  to  predict  probability  of  failure.  The 
formulation  of  the  MBKG  surrogate  modeling  method  is 
presented,  and  the  full  conditional  distributions  of  the  unknown 
MBKG  parameters  are  presented.  Using  the  full  conditional 
distributions  with  a  Gibbs  sampling  algorithm,  Markov  chain 
Monte  Carlo  is  used  to  fit  the  MBKG  surrogate  model.  A 
sequential  sampling  method  that  uses  the  posterior  credible  sets 
for  inserting  new  design  of  experiment  (DoE)  sample  points  is 
proposed.  The  sequential  sampling  method  is  developed  in  such 
a  way  that  the  newly  added  DoE  sample  points  will  provide  the 
maximum  amount  of  information  possible  to  the  MBKG 
surrogate  model,  making  it  an  efficient  and  effective  way  to 
reduce  the  number  of  DoE  sample  points  needed.  Therefore,  the 
proposed  method  improves  the  posterior  distribution  of  the 
probability  of  failure  efficiently.  To  demonstrate  the  developed 
MBKG  and  sequential  sampling  methods,  a  2-D  mathematical 
example  with  added  random  noise  is  used.  It  is  shown  how,  with 
the  use  of  the  sequential  sample  method,  the  posterior 
distribution  of  the  probability  of  failure  converges  to  capture  the 
true  probability  of  failure.  A  3-D  multibody  dynamics  (MBD) 
engineering  block-car  example  illustrates  an  application  of  the 
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new  method  to  a  simple  engineering  example  for  which  standard 
Kriging  methods  fail. 
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1.  INTRODUCTION 

Reliability  analysis  methods  have  been  developed  to 
analyze  the  reliability  of  product  designs  subject  to  the 
randomness  of  the  input  variables.  Most  reliability  analysis 
methods  can  be  classified  into  one  of  two  groups,  the  first  being 
sensitivity-based  methods  and  the  second  being  sampling-based 
methods.  The  literature  is  rich  with  numerous  sensitivity-based 
methods  that  have  been  developed  using  the  most  probable  point 
[1-6].  Some  common  sensitivity-based  methods  are  the  first- 
order  reliability  method  (FORM)  [5,  7,  8],  the  second-order 
reliability  method  (SORM)  [5,  7-9],  and  the  dimension  reduction 
method  (DRM)  [3,  10,  11].  While  these  methods  can  be 
computationally  cheaper  than  sampling-based  methods,  one 
pitfall  is  that  they  may  not  be  as  accurate  as  sampling-based 
methods  for  highly  nonlinear  problems  [12].  Another 
shortcoming  of  these  methods  is  that  they  require  the  sensitivity 
of  the  performance  measures  to  be  available.  Obtaining  the 
sensitivity,  while  possible  for  some  problems,  can  be  a  daunting, 
if  not  impossible,  task  for  some  problems  that  are  highly 
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nonlinear  and/or  coupled  with  fluid  structure  interaction,  e.g., 
crash  and  blast  problems. 

To  overcome  the  shortfalls  of  the  sensitivity-based  methods, 
sampling-based  methods  have  been  developed.  A  brute  force 
approach  would  be  to  do  direct  Monte  Carlo  simulation  (MCS) 
using  the  computer-aided  engineering  (CAE)  simulation  models, 
e.g.,  finite  element  (FE)  models  and  computational  fluid 
dynamic  (CFD)  models,  to  calculate  the  reliability.  While  this 
method  can  be  highly  accurate  using  a  large  number  of  MCS 
points  [5],  its  limitation  is  that  it  requires  very  significant 
computational  cost  due  to  a  large  number  of  CAE  model 
simulations,  thus  rendering  it  impractical.  In  order  to  overcome 
the  impractical  computational  cost  of  direct  MCS  using  CAE 
simulations  and  to  overcome  the  pitfalls  of  the  sensitivity-based 
methods,  the  use  of  surrogate  models  is  becoming  a  more 
common  practice  [13-18].  There  are  three  advantages  of  using 
surrogate  models.  The  first  is  that  the  sensitivity  of  the 
performance  measure  is  not  needed  to  construct  the  surrogate. 
The  second  is  that  surrogate  models  are  computationally 
inexpensive  to  use  for  evaluating  large  numbers  of  MCS  points 
compared  to  the  computational  cost  of  the  CAE  models.  The 
third  is  that  surrogate  models  can  be  built  using  a  limited  number 
of  CAE  model  simulations,  therefore  reducing  the  overall 
computational  cost  of  performing  a  reliability  analysis. 

The  literature  is  full  of  numerous  surrogate  modeling 
methods  that  have  been  developed  over  decades  and  are  still 
being  actively  developed.  The  main  reason  for  all  the 
development  is  that  there  is  not  one  surrogate  modeling  method 
that  works  for  every  problem.  Some  of  the  common  surrogate 
modeling  methods  in  the  literature  are,  polynomial  response 
surface  (PRS)  [17,  19-21],  polynomial  chaos  [22],  moving  least 
squares  (MLS)  [19],  multivariate  adaptive  regression  splines 
(MARS)  [23],  support  vector  machine  and  support  vector 
regression  [19,  20],  virtual  support  vector  machine  (VSVM) 
[24],  radial  basis  functions  (RBF)  [19,  20,  23],  neural  networks 
(NN)  [25],  ordinary  Kriging  (OKG)  and  universal  Kriging 
(UKG)  [19,  20,  23,  26,  27],  and  dynamic  Kriging  (DKG)  [15, 
16].  There  have  been  surrogate  modeling  methods  developed 
that  use  Bayesian  methods  [28-31].  However,  none  of  the 
methods  studied  claim  to  do  a  hill  Bayesian  analysis  when 
creating  the  surrogate  model;  they  are  only  borrowing  concepts 
from  the  Bayesian  methods. 

As  previously  described,  a  surrogate  model  is  used  to  do  the 
MCS  prediction  for  the  reliability  analysis  for  the  sampling- 
based  reliability-based  design  optimization  (RBDO)  method  to 
ease  the  computational  burden.  Thus,  the  accuracy  of  the 
reliability  analysis  depends  on  the  accuracy  of  the  surrogate 
model.  Kriging  has  become  a  popular  surrogate  modeling 
method  because  it  offers  flexibility  in  the  choice  of  both  the  mean 
structure  and  the  correlation  function  used  [19,  20].  Dynamic 
Kriging  was  developed  because  it  was  found  to  be  more  accurate 
than  OKG  or  UKG,  giving  better  reliability  analysis  results  [15, 
16,  46,  47].  However,  OKG,  UKG,  and  DKG  are  typically 
formulated  and  used  as  interpolation  methods  and  therefore 
break  down  when  the  response  data  contains  noise  [19,  20,  32, 
33].  There  are  approximation  and  regression  methods,  e.g.  MLS 
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and  MARS,  which  can  be  used  when  the  response  data  contains 
noise.  The  formulation  of  Kriging  can  even  be  modified  to 
change  it  to  a  regression  method  [19,  20,  32,  33]. 

The  disadvantage  of  these  methods  is  that  they  do  not  allow 
for  a  direct  way  to  separate  the  noise  from  the  data.  Thus,  when 
using  them  to  predict  response  values,  it  is  not  known  how  much 
noise  there  may  be  in  the  predicted  response  value.  If  these 
regression  surrogate  models  are  used  for  MCS  prediction  for 
reliability  analysis,  this  will  affect  the  amount  of  uncertainty  and 
variability  that  there  is  in  the  reliability  analysis.  It  is  not  clear 
or  easy  to  distinguish  if  the  variability  in  the  reliability  analysis 
is  due  to  the  noise  in  the  predicted  response  or  due  to  the 
uncertainty  in  the  surrogate  model  itself.  Another  disadvantage 
of  these  methods  is  that  they  do  not  have  a  systematic  way  of 
characterizing  the  uncertainty  of  the  surrogate  model,  the  noise 
in  the  response  value,  or  the  uncertainty  in  the  predicted  response 
values  using  the  surrogate  model.  Therefore,  a  surrogate 
modeling  method  that  can  systematically  characterize  all  of  these 
uncertainties  as  well  as  accurately  predict  the  true  underlying 
response  value  without  noise  needs  to  be  investigated.  Bayesian 
statistical  methods  allow  a  natural  and  systematic  way  of 
characterizing  uncertainty  when  estimating  both  unknown 
parameters  of  distributions  and  other  quantities  of  interest  that 
depend  on  these  unknown  distribution  parameters. 

In  this  paper  a  new  modified  Bayesian  Kriging  (MBKG) 
surrogate  modeling  method  is  developed  for  handling 
performance  measures  of  noisy  simulations,  which  will 
accurately  represent  the  true  underlying  unknown  performance 
function  without  noise.  The  likelihood  for  the  MBKG  is 
proposed,  and  the  prior  distributions  to  be  used  with  the 
likelihood  to  fit  the  Bayesian  model  are  presented.  When 
possible,  conjugate  prior  distributions  are  used  to  help  simplify 
the  full  conditional  distributions  and  to  ease  the  computational 
burden  of  fitting  the  surrogate  model.  Using  the  full  conditional 
distributions  with  a  Gibbs  sampling  algorithm  [34],  Markov 
chain  Monte  Carlo  is  used  to  fit  the  MBKG  surrogate  model. 
Computer  simulations  of  engineering  models  are  often 
computationally  expensive;  therefore,  it  is  necessary  to  reduce 
the  number  of  design  of  experiment  (DoE)  samples  needed.  An 
efficient  DoE  sampling  method  using  the  credible  sets  of  the 
MBKG  model  is  developed  to  systematically  reduce  the 
uncertainty  in  the  surrogate  model.  Finally  the  developed 
MBKG  surrogate  modeling  method  together  with  the  developed 
sequential  sampling  method  are  demonstrated  using  a  2-D 
mathematical  example  and  a  3-D  multibody  dynamics 
engineering  block-car  example.  It  is  shown  how  the  posterior 
distribution  of  the  probability  of  failure  is  improved  efficiently. 

2.  REVIEW  OF  BAYESIAN  STATISTICS 

2. 1  Likelihood  and  Prior  Distributions 

In  real-world  problems  there  is  often  data  available  or 
obtainable  that  is  known  or  assumed  to  come  from  a  given 
distribution  type,  e.g.,  the  data  is  known  to  follow  a  normal 
distribution.  The  distribution  from  which  the  data  comes  is 
referred  to  as  the  sampling  distribution  or  data  distribution  [35, 
36].  However,  even  when  the  distribution  type  of  the  data  is 
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known,  the  parameter  values  of  the  distribution  are  often 
unknown.  When  the  data  distribution  is  considered  as  a  function 
of  the  unknown  distribution  parameters  for  a  given  data  set,  it  is 
referred  to  as  the  likelihood  function  [35-38]. 

The  goal  in  Bayesian  statistics  is  to  come  up  with  an 
estimate  of  the  unknown  parameter  values  using  both  the 
available  data  and  prior  knowledge  about  the  unknown 
parameters.  In  Bayesian  statistics  the  unknown  parameters  are 
treated  as  if  they  are  random  variables.  Any  prior  knowledge  or 
belief  about  the  unknown  parameter  is  expressed  using  a 
probability  distribution.  The  definition  of  subjective  probability 
of  an  event  given  by  [37]  is:  “A  probability  of  an  event  or  of  the 
truth  of  a  statement  is  a  number  between  0  and  1  that  quantifies 
a  particular  person’s  subjective  opinion  as  to  how  likely  that 
event  is  to  occur  (or  to  have  already  occurred)  or  how  likely  the 
statement  is  to  be  true.”  Similarly,  a  subjective  probability 
distribution  quantifies  one’s  knowledge  about  an  unknown 
parameter  that  may  take  on  any  value  in  a  continuum.  These 
probability  distributions  that  express  one’s  knowledge  about  the 
unknown  parameters  are  referred  to  as  the  prior  distributions  or 
simply  as  the  priors  [35-38].  A  Bayesian  analysis  is  carried  out 
to  update  one’s  subjective  probability  distribution  of  the 
unknown  parameters  by  combining  prior  information  with  the 
new  information  contained  in  the  data.  The  next  section  will 
describe  how  Bayes’  rule  is  used  with  the  likelihood  and  prior 
distributions  to  calculate  the  posterior  distribution. 

2.2  Bayes’  Rule  and  Posterior  Distributions 

Bayes’  rule  provides  a  way  to  mathematically  and 
systematically  update  one’s  subjective  probability  distribution 
on  model  parameters.  All  knowledge  available  before  the 
current  data  is  observed  is  encapsulated  in  the  prior.  The 
updating  occurs  by  incorporating  the  new  data  contained  in  the 
likelihood.  Bayes’  rule  states  that  the  posterior  probability 
distribution  of  model  parameters  given  the  data  is  proportional 
to  the  product  of  the  likelihood  and  prior  probability  distribution 
[35-38].  Mathematically  this  can  be  written  as  shown  in  Eq.  (1). 

/  ( Parameter  \  Data )  oc  /  ( Parameter )  x 
/  ( Data  |  Parameter ) 

The  final  probability  distribution  given  on  the  left  side  of 
Eq.  (1)  is  called  the  posterior  distribution,  often  referred  to  as  the 
posterior.  The  posterior  distribution  expresses  the  current  state 
of  knowledge  about  model  parameters.  The  posterior 
distribution  can  be  used  to  obtain  desired  values  about  the 
parameter,  e.g.,  the  mean  or  median  of  the  posterior  could  be 
used  as  a  point  estimate  for  the  unknown  parameter  value.  The 
posterior  variance  reveals  the  amount  of  uncertainty  that  remains 
about  the  parameter  value — the  larger  the  variance,  the  larger  the 
uncertainty  about  what  the  true  parameter  value  is.  The  posterior 
can  also  be  used  to  give  probability  intervals,  called  credible  sets, 
which  are  believed  to  contain  the  true  parameter  value  with  the 
specified  probability,  e.g.,  the  95%  credible  set  for  the  parameter 
can  easily  be  obtained  from  the  posterior.  Credible  sets  have  a 
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slightly  different  meaning  than  confidence  intervals.  For  the 
95%  credible  set,  the  probability  that  the  true  value  of  the 
parameter  is  in  that  interval  is  95%.  A  95%  confidence  interval 
means  that,  if  the  same  experiment  or  test  is  repeated  many  times 
to  generate  many  different  data  sets  and  each  data  set  is  used  to 
generate  a  95%  confidence  interval,  only  95%  of  the  confidence 
intervals  generated  would  capture  the  true  parameter  value, 
while  5%  of  the  confidence  intervals  generated  would  not 
capture  the  true  parameter  value  [35,  37].  The  next  section  will 
discuss  the  use  of  conjugate  priors  and  Markov  chain  Monte 
Carlo  for  updating  the  posterior  distribution. 

2.3  Conjugate  Priors  and  Markov  Chain  Monte  Carlo 

The  product  of  the  prior  distribution  and  likelihood  is  used 
in  Bayes’  rule  to  construct  the  posterior  distribution  of  the 
unknown  parameter(s).  When  possible,  it  is  desirable  to  choose 
the  prior  distribution  from  a  parametric  family  that  takes  on  the 
same  functional  form  as  the  likelihood  function  for  the  unknown 
parameter(s);  priors  of  such  a  form  are  called  conjugate  priors. 
The  parameters  of  the  prior  distribution  are  then  chosen  such  that 
the  prior  reflects  the  known  information  and  beliefs  about  the 
unknown  parameter(s).  When  applying  Bayes’  rule  to  the 
likelihood  with  a  conjugate  prior,  the  posterior  distribution  will 
belong  to  the  same  parametric  family  as  the  prior,  i.e.,  the 
posterior  distribution  type  will  be  the  same  distribution  type  as 
the  prior.  The  parameter  values  of  the  posterior  distribution  will 
be  a  combination  of  the  prior  parameter  values  and  the  data  used 
for  the  Bayesian  analysis  [35-37]. 

There  are  scenarios  in  which  a  conjugate  prior  does  not  exist 
for  a  given  problem,  e.g.,  a  conjugate  prior  does  not  adequately 
reflect  the  prior  knowledge  and  belief  about  the  unknown 
parameter(s),  or  there  are  multiple  unknown  parameters  for 
which  a  conjugate  joint  distribution  does  not  exist.  For  such 
scenarios,  any  distribution  type  that  reflects  the  prior  knowledge 
and  belief  about  the  unknown  parameter(s)  can  be  used  as  the 
prior.  The  use  of  such  priors,  however,  leads  to  posterior 
distributions  that  most  likely  are  not  from  a  known  distributional 
family.  Bayes’  rule  can  still  be  applied  in  such  cases  but  this  has 
to  be  done  using  a  numerical  method. 

Markov  chain  Monte  Carlo  (MCMC)  is  a  numerical  method 
that  can  be  used  to  draw  samples  from  high-dimensional  and 
nonstandard  probability  distributions.  One  disadvantage  of 
MCMC  is  that  the  samples  drawn  from  the  distribution  are  not 
independent,  and  this  needs  to  be  taken  into  consideration  when 
using  the  samples  for  inference  [35-37,  39].  The  Markov 
property  states  that  a  sample  drawn  at  a  given  time  point 
conditional  on  the  sample  drawn  at  the  time  point  immediately 
before  it  is  independent  of  all  the  earlier  samples  drawn.  Under 
certain  regularity  conditions,  it  can  be  shown  that  a  Markov 
chain  will  converge  in  distribution  to  samples  drawn  from  the 
target,  i.e.,  posterior  distribution  [35,  37,  39].  There  is  a  debate 
over  whether  it  is  better  to  run  one  long  Markov  chain  or  to  run 
multiple  shorter  parallel  Markov  chains  starting  at  different 
initial  values  to  attempt  to  determine  whether  the  Markov  chain 
has  converged  to  the  target  distribution  [40].  One  common  and 
well-accepted  way  of  diagnosing  convergence  in  distribution 
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when  using  MCMC  is  to  use  the  Brooks,  Gelman,  and  Rubin 
(BGR)  diagnostic  [41,  42].  The  BGR  diagnostic  requires 
running  at  least  two  parallel  Markov  chains  starting  at  over¬ 
dispersed  initial  values.  The  BGR  diagnostic  uses  the  samples 
drawn  from  the  parallel  Markov  chains  to  calculate  credible  sets 
of  the  individual  chains  using  an  increasing  number  of  samples 
from  the  chain.  The  parallel  chains  are  also  pooled  together  to 
form  one  sample  set  that  is  used  to  calculate  credible  sets  using 
an  increasing  number  of  samples.  If  widths  of  the  credible  sets 
calculated  using  the  two  different  methods  stabilize  and  become 
approximately  equal,  the  MCMC  chains  are  likely  to  have 
converged  in  distribution  [37,  41,  42].  It  is  possible  for  the  BGR 
diagnostic  to  misdiagnose  convergence,  i.e.,  the  BGR  diagnostic 
shows  that  convergence  has  been  achieved  when  in  actuality 
convergence  has  not  yet  been  reached. 

3.  REVIEW  OF  KRIGING  AND  DYNAMIC  KRIGING 

3.1  Kriging  Method 

The  Kriging  method  is  based  on  the  assumption  that  the 
response  values  of  interest  are  a  realization  from  a  stochastic 
process.  Consider  N  design  of  experiment  (DoE)  samples, 
denoted  as  \DoE  =  (x1,x2,...,xAr)T ,  and  the  corresponding  N 
responses,  denoted  as  y  =  (y19y2 ,...,yN)T ,  where  xmgT 
and  m  is  the  number  of  input  variables,  i.e.,  the  spatial 
dimension  of  the  input  variables.  The  Kriging  model  for  the 
responses  is  composed  of  two  parts  and  is  expressed 
mathematically  as 

y  =  Fp  +  Z  (1) 

where  Fp  is  the  mean  structure  of  the  response, 
F  =  [fy(x.)],z  =  1,2,..., N,j  =  1,2,...,K  isa  NxK  designmatrix 
where  f.  (x.)  is  the  f  basis  function  evaluated  at  the  ith  DoE 

sample  point,  and  P  =  [J319J32,...9/3K]T  are  the  regression 
coefficients  from  the  generalized  least  squares  regression 
method.  The  second  part,  Z  ,  is  a  realization  from  a  stationary 
Gaussian  random  process  with  zero  mean  and  covariance 
function  given  by 

E(x,.,x.)  =  o-lK(0,x,.,xy)  (2) 

where  cr2  is  the  process  variance,  R  is  the  spatial  correlation 
function,  0  is  a  vector  containing  the  correlation  function 
parameters,  and  x. ,  x .  are  two  DoE  samples.  There  are  a 

number  of  different  correlation  functions  that  can  be  used,  e.g., 
Gaussian,  exponential,  general  exponential,  linear,  spherical, 
cubic,  and  spline.  However,  for  engineering  problems,  often  the 
Gaussian  correlation  function  is  used  since  it  is  infinitely 
differentiable  and  provides  a  smooth  response  surface.  The 
parameters  of  the  Kriging  model  that  best  fit  the  DoE  samples 
and  corresponding  response  values  are  determined  by  using  the 
maximum  likelihood  estimation  (MLE)  method. 
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3.2  Dynamic  Kriging  Method 

As  previously  mentioned,  there  are  a  number  of  different 
correlation  functions  to  choose  from,  and  there  are  also 
numerous  basis  functions  to  choose  from.  It  has  been  shown  that 
for  different  problems  and  data  some  correlation  functions  fit 
better  than  others;  it  has  also  been  shown  that  a  larger  basis 
function  set  is  not  necessarily  better  than  a  smaller  basis  function 
set  [15,  16].  The  latest  proposed  dynamic  Kriging  method 
chooses  the  mean  structure  from  a  choice  of  three  mean 
structures  and  the  correlation  function  from  seven  choices  that 
best  fit  the  data  [43].  This  dynamic  Kriging  method  provides 
more  flexibility  and  an  automated  way  to  build  a  Kriging  model 
that  can  fit  data  from  a  wide  range  of  problems.  However,  it  is 
an  interpolation  method  and  breaks  down  when  the  response 
values  contain  noise.  Because  the  standard  Kriging  and  dynamic 
Kriging  methods  are  interpolation  methods  they  do  not 
accurately  predict  the  true  underlying  response  value  when  the 
response  values  from  the  simulations  contain  noise.  As  more 
response  values  are  used  to  build  the  Kriging  models,  the 
standard  Kriging  method  tends  to  break  down,  and  the  surface 
predicted  by  the  Kriging  models  becomes  very  rough. 

4.  MODIFIED  BAYESIAN  KRIGING  (MBKG) 

4. 1  Modified  Bayesian  Kriging  Formulation 

The  basic  assumption  of  Kriging  is  that  the  response  values 
are  realizations  from  a  Gaussian  random  process.  Kriging  is 
commonly  referred  to  as  a  Bayesian  method,  though  it  is 
typically  fitted  using  MLE  methods  rather  than  Bayesian 
methods.  The  likelihood  for  the  Bayesian  model  is  the  Gaussian 
random  process. 

The  modified  Bayesian  Kriging  method  assumes  the 
response  values  come  from  a  stationary  Gaussian  random 
process  with  a  constant  mean  structure  in  the  following  form: 

y  ~  MVN^c1  +  ^,<j2m)  (3) 

where  y  is  the  vector  of  response  values  at  the  DoE  samples. 
The  above  formulation  states  that  the  response  values  are 
conditionally  independent  given  parameters  defining  a  variance 
of  cr2yl  and  a  mean  value  of  juc  1  +  (p  ,  which  is  composed  of  two 
parts,  the  first  part  is  a  constant  value  of  juc ,  and  the  second  part 
is  q> ,  which  depends  on  the  spatial  correlation  of  the  DoE 
samples  x  ;  this  is  expressed  mathematically  as 

<p~MFAf(0,cr2T)  (4) 

where  cr2xP  is  the  covariance  of  the  Gaussian  process,  cj2  is  the 
variance  of  the  (p  values,  and  is  the  spatial  correlation  matrix. 
Similar  to  the  Kriging  method,  the  spatial  correlation  matrix  is  a 
function  of  the  DoE  samples  x  via  the  correlation  function  that 
is  used.  The  Gaussian  correlation  function  is  the  most  commonly 
used  for  engineering  problems  and  is  defined  as 
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vp(0 


=  exp 


f 


V 


-I*, 


V.(0 


(5) 


where  Oj  is  the  f  correlation  function  parameter 
corresponding  to  the  f  dimension,  k  is  the  number  of  spatial 
dimensions,  xj  is  the  value  of  DoE  sample  for  the  f 

dimension,  and  superscript  i  is  for  the  ith  DoE  sample.  Table  1 
lists  seven  different  correlation  functions  from  the  literature  [43]. 


Table  1.  Correlation  Functions 


Name 

vp(0 

Gaussian 

exp 

(  k 

~idM 

V  J=l 

(0  r  \pj 

i  XJ  \ 

\ 

J 

Exponential 

exp(-0.  xf-x.  ) 

General 

Exponential 

exp(-6>y. 

|xf -X.f“’),0  <6 

1  ,  <2 
n+ 1 

Linear 

max  jo,  1-^. 

VP -4 

! 

Spherical 

1-1.5£,+O.5£,£,=min{l,0, 

Cubic 

1- 

?-4 

Spline 

< 

'1-15# +30£, 
1.25(1-<T,.)3, 

°, 

for  0  <  <  0.2 

for  0  <  <1 

for  gj  >  1, 

where  -  6j  | 

xf-x. 

The  unknown  MBKG  parameters  that  need  to  be  determined 
in  order  to  fit  the  model  to  the  data,  i.e.,  fit  the  model  to  the  DoE 
samples  x  and  their  corresponding  response  values  y ,  are  juc , 
cr2 ,  A  ,  and  0  .  The  dimension  of  the  0  vector  is  the  same  as 
the  spatial  dimension  of  x  .  From  the  Bayesian  perspective,  the 
likelihood  of  the  response  values  is  the  Gaussian  process  given 
in  Eq.  (3),  and  the  prior  for  the  (p  vector  is  the  Gaussian  process 
given  in  Eq.  (4).  The  prior  distributions  for  the  remaining 
unknown  parameters  will  be  presented  in  the  next  section. 

The  modified  Bayesian  Kriging  method  can  also  be 
formulated  to  have  a  mean  structure  that  is  a  function  of  the  DoE 
samples  x  .  For  this  formulation  Eq.  (3)  is  modified  to  give: 


y  ~  MVN (p  +  (p,  g2 y II )  (6) 

where  juc  1  is  replaced  by  the  vector  p ,  which  is  a  function  of 
the  DoE  sample  points.  The  p  vector  is  expressed  as: 


P  —  Fp 


(7) 


where  F  is  the  design  matrix  composed  of  the  polynomial  basis 
functions  evaluated  at  the  DoE  sample  points.  For  first  order  the 
design  matrix  F  is  expressed  as 


F  = 


1 

1 


1 


A m  • 

..  rW 

x<2) 

x(2) 

X'n 

(m) 

(m) 

x; 

. .  x 

Xn 

lmx(«+l) 


(8) 


where  m  is  the  number  of  DoE  and  n  is  the  number  of  variables, 
i.e.,  the  dimension  of  the  problem,  and  p  is  a  vector  of  unknown 

coefficients  to  be  determined  when  fitting  the  MBKG  surrogate 
model. 

4.2  Prior  Distributions  for  MBKG  Parameters 

For  the  three  unknown  parameters  juc ,  a2 ,  and  A  ,  semi¬ 
conjugate  prior  distributions  are  used  to  help  improve  the 
efficiency  of  fitting  the  MBKG  surrogate  model.  For  the  juc 
parameter,  the  conjugate  prior  distribution  is  a  normal 
distribution  and  is  expressed  as 

Me  ~  N(f*P>ol)  (9) 

where  fi  and  g2  are  the  prior  mean  and  variance  of  the  juc 

parameter,  respectively.  The  conjugate  prior  distribution  for  the 
a2  parameter  is  an  Inverse-Gamma  distribution  and  is  expressed 
as 


g2  ~  Inver seGamma {aG,PG )  (10) 

where  aG  and  PG  are  the  prior  parameters  for  the  distribution. 
The  conjugate  prior  for  the  A  parameter  is  also  an  Inverse- 
Gamma  distribution  and  is  expressed  as 

A  ~  InverseGamma (a (11) 

where  a}  and  /T  are  the  prior  parameters  for  the  distribution. 
The  0  parameters  are  embedded  in  the  correlation  matrix,  and 
because  of  this  there  is  no  known  conjugate  distribution  type  that 
can  be  used  for  the  prior  distribution.  Thus,  the  prior  distribution 
for  each  of  the  0  parameters  is  chosen  to  be  a  uniform 
distribution  and  is  expressed  as 
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0,.~U  (aej,bgj)  (12) 

where  Oj  is  the  jth  correlation  function  parameter,  and  ae.  and 
be  are  the  prior  parameters  for  . 

If  the  mean  structure  is  not  constant,  then  the  \i  vector  is 
used  as  in  Eq.  (7)  and  the  conjugate  prior  is  a  multivariate  normal 
distribution  and  is  expressed  as 


P -MVN^,^)  (13) 


where  and  2^  are  the  prior  mean  vector  and  covariance 
matrix  for  the  distribution. 

Prior  to  fitting  the  MBKG  model,  the  response  values  y  are 
normalized  such  that  they  have  a  zero  mean.  The  normalization 
is  expressed  as 


yi  -  mean(y) 
std(  y) 


(14) 


where  y.  is  the  normalized  response  value  of  the  ith  response, 
y.  is  the  original  response  value  for  the  ith  response,  mean(y) 
is  the  mean  of  the  original  response  values,  and  std(y)  is  the 
standard  deviation  of  the  original  response  value.  The  DoE 
samples  are  also  normalized  in  the  same  way  as  the  response 
values.  In  the  literature  it  is  has  been  found  that,  when  fitting 
surrogate  models,  a  better  fit  can  usually  be  achieved  when  the 
DoE  samples  and  response  values  are  normalized  in  some  way 
to  help  restrict  the  range  of  the  possible  values  than  when  they 
are  not  normalized  [20]. 

In  addition  to  the  normal  distribution  being  a  conjugate  prior 
for  the  juc  parameter,  it  is  also  a  good  choice  because  of  the 
normalization  of  the  response  values.  It  is  expected  that  the 
overall  mean  of  the  normalized  response  values  should  be  close 
to  zero.  Thus,  a  normal  distribution  with  a  zero  mean  and  a 
standard  deviation  of  0.5  is  a  prior  distribution  that  reflects  this 
knowledge.  This  prior  distribution  can  still  be  considered  a 
relatively  noninformative  prior  distribution;  thus,  it  can  be  used 
in  general  when  creating  a  surrogate  model  for  any  problem. 

The  Inverse-Gamma  distribution  has  positive  support, 
meaning  that  the  random  variable  of  an  Inverse-Gamma 
distribution  can  only  take  on  positive  values.  From  Eqs.  (3)  and 
(4)  it  can  be  seen  that  both  parameters  cr2  and  A  are  variance 
parameters  to  a  Gaussian  process;  therefore,  by  definition  they 
can  only  take  on  positive  values.  With  the  Inverse-Gamma 
distribution  being  a  positive  value  distribution  and  a  conjugate 
prior  for  both  cr2  and  A ,  it  is  ideally  suited  to  be  used  as  the 
prior  distribution. 

A  uniform  distribution  for  the  0  parameters  was  chosen 
mostly  because  it  gives  a  direct  way  to  limit  the  possible  values 


that  the  0  parameters  can  take  on.  Due  to  the  normalization  of 
the  DoE  samples,  the  maximum  distance  in  each  of  the  spatial 
directions  from  any  given  DoE  sample  to  all  the  others  is  roughly 
three.  After  studying  the  Gaussian  correlation  function  closely, 
it  was  found  that  a  6  value  of  five  and  greater  produces  a 
correlation  value  that  is  asymptotically  approaching  zero 
regardless  of  the  distance  between  the  DoE  samples.  This  is 
shown  in  Fig.  1,  where  the  vertical  axis  is  the  squared  distance 
between  DoE  sample  points,  and  the  horizontal  axis  is  the 
correlation  function  parameter  6 .  As  seen  in  the  Fig.l,  when 
0 « 2.5  and  the  squared  distance  is  greater  than  one,  the 
correlation  is  smaller  than  0.1.  Also  seen  in  the  Fig.  1  is  that 
when  0  =  5,  the  correlation  is  smaller  than  0.1  except  for 
squared  distances  that  are  approximately  less  than  0.5.  Given 
this  study,  it  was  determined  that  an  appropriate  upper  bound 
would  be  0  =  5  and  an  appropriate  lower  bound  would  be 
<9  =  0.15,  these  are  the  bounds  used  in  the  uniform  prior 
distribution. 


Figure  1.  Correlation  Contour  of  the  Gaussian  Correlation 
Function 


4.3  Full  Conditional  Distributions  for  MBKG  Parameters 

The  joint  posterior  distribution  of  the  unknown  parameters 
in  a  Bayesian  model  is  needed  to  derive  the  full  conditional 
distributions  of  the  unknown  parameters  being  estimated.  The 
joint  posterior  distribution  for  the  MBKG  formulation  given  in 
Eq.  (6)  is  expressed  as 

/(//C,c r2,0,A,(p  I  y)  oc  n[UK  ’h0j )] * 

7=1 

InverseGamma  (aa,/3a)*  N(jUp,cr2p)*  ^  j 

InverseGamma  (  a  -  ,/?,)*  MVN  (0,cr2'P)* 

MVN  (fi  +  (p,  (j2/ll ) 

The  derivation  of  the  full  conditional  distributions  these 
distributions  is  not  new  work  and  therefore  in  the  interest  of 
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space  are  not  repeated  in  this  paper.  However,  the  full 
conditionals  for  each  of  the  parameters  are  given.  The  full 
conditional  for  cr2  can  be  shown  to  be 


f(&2  |  y,//c,0,A5<p)  ~  Inver seGamma  aa+n , 


+ “(y  -  n  -  <pf  (y  -  h  -  <p) 


(16) 


The  hill  conditional  for  A  can  be  shown  to  be 


/  (/l  |  y,  ,  cr2 , 0,  (p)  ~  InverseGamma 

1  1  T 

Pjl+t— (y-i»-<p)  (y-n-<p) 

Z  CJ 


a ;  +- 


(17) 


The  full  conditional  for  the  6.  parameter  can  be  shown  to  be 


Ln[f(dj\y,juc,<y2, 0 , A, <p )]  oc  - ^ Ln [a2 ] - 

1  f  1  A  (18) 

2L"M  +  {-2<VT^) 

To  draw  samples  from  the  nonstandard  distribution  given  in  Eq. 
(17),  the  Metropolis-Hastings  algorithm  [44,  45]  will  be  used. 
For  numerical  stability,  the  natural  logarithm  of  the  full 
conditional  is  used  with  the  Metropolis-Hastings  algorithm. 

The  full  conditional  distribution  for  q>  can  be  shown  to  be 

f(p\y,cr2,A,Q,<?)~MVN(y,A-')  (19) 

where  y ,  A  ,  and  b  are  defined  in  Eqs.  (20),  (21),  and  (22) 
respectively. 


Y  =“A_1b  (20) 

A-(?irF+(s')1  (21) 

b-2((^)r(^)"-^j(fr-yr)F)  (K) 

The  full  conditional  distribution  for  p  can  be  shown  to  be 

/(p|y,Cj2,A,0,(p)~M17V(y,A-1)  (22) 


5.  SEQUENTIAL  SAMPLING  VIA  CREDIBLE  SETS 

Recall  that  the  objective  of  developing  the  MBKG  surrogate 
modeling  method  is  so  that  it  can  be  used  for  carrying  out 
reliability  analyses.  When  using  surrogate  models  to  carry  out 
reliability  analysis  it  is  common  practice  to  normalize  the 
response  values  such  that  failure  is  defined  to  be  when  the 
response  value  is  greater  than  zero.  This  zero  boundary  that 
separates  the  failure  and  safe  region  is  called  the  limit  state. 
Thus,  in  order  to  use  an  MBKG  surrogate  model  for  carrying  out 
a  reliability  analysis,  the  accuracy  of  the  limit  state  function  is  of 
high  importance  because  it  will  dictate  the  accuracy  of  the 
reliability  analysis.  A  sequential  sampling  method  is  developed 
for  adding  more  design  of  experiment  (DoE)  samples  to  further 
improve  the  modified  Bayesian  Kriging  (MBKG)  surrogate 
model  by  utilizing  the  information  in  the  posterior  credible  sets 
of  the  MBKG  surrogate  model.  The  method  is  developed  such 
that  it  improves  the  accuracy  of  the  limit  state  as  more  DoE 
samples  are  added. 

The  MBKG  surrogate  model  is  not  a  deterministic  surrogate 
model,  but  rather  a  surrogate  model  that  produces  posterior 
distributions  for  the  MBKG  parameters.  Therefore,  a  predicted 
response  value  for  a  given  point  does  not  have  one  deterministic 
value  but  rather  has  a  probability  distribution  that  gives  the 
probability  of  the  predicted  response  value  being  in  any  interval. 
The  Markov  chain  Monte  Carlo  (MCMC)  samples  drawn  from 
the  predictive  distribution  of  the  response  variable  can  be  used 
to  estimate  any  desired  characteristic  of  the  distribution,  e.g.,  the 
mean,  standard  deviation,  and  credible  sets.  The  larger  the 
standard  deviation  and  the  wider  the  credible  set,  the  more 
uncertainty  there  is  in  the  predicted  value,  i.e.,  more  uncertainty 
in  what  the  true  value  is.  The  probability  distributions 
characterize  this  uncertainty  and  provide  information  that  can  be 
used  to  further  improve  the  MBKG  surrogate  model. 

When  adding  new  DoE  samples  to  improve  the  MBKG 
surrogate  model,  it  is  desirable  to  place  them  far  from  the 
existing  samples  so  that  they  will  contribute  more  information  to 
the  model.  It  is  also  desirable  to  place  them  at  locations  where 
the  amount  of  uncertainty  in  the  limit  state  is  the  largest.  Placing 
the  new  DoE  samples  in  this  strategic  way  will  provide  the 
maximum  amount  of  information  to  the  MBKG  surrogate  model 
and,  thus  will  require  fewer  DoE  samples  to  be  used  to  fit  the 
MBKG  surrogate  model. 

To  select  the  DoE  samples  to  fit  the  MBKG  surrogate  model, 
random  uniform  samples  are  generated  throughout  the  domain 
of  the  variables  and  are  referred  to  as  test  points.  That  is,  500 
random  uniform  test  points  may  be  generated  for  a  2-D  or  3-D 
problem.  After  the  test  points  are  created,  a  small  number,  e.g., 
25,  of  the  test  points  are  selected  randomly  so  that  they  uniformly 
cover  the  domain  of  the  variables.  These  initially  selected  test 
points  are  then  used  to  fit  the  MBKG  surrogate  model.  After  the 
MBKG  surrogate  model  has  been  fitted,  all  of  the  test  points  are 
then  used  to  generate  the  predictive  distribution  for  each 
response  value  corresponding  to  each  test  point.  These 
predictive  distributions  characterize  the  uncertainty  in  the 
predicted  response  values.  The  wider  the  distribution,  the  more 
uncertainty  in  the  response  value. 
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To  reduce  the  uncertainty  in  the  limit  state,  it  is  desirable  to 
select  test  points  whose  predicted  responses  have  95%  credible 
sets  that  contain  zero  and  are  wide.  If  the  credible  set  contain 
zeros  this  indicates  that  the  limit  state  may  be  near  by  the  test 
point.  If  the  credible  set  is  wide  this  indicates  there  is  a  large 
amount  of  uncertainty  in  the  predicted  response.  Test  points  with 
these  two  characteristics  are  ideal  candidates  for  being  new 
additional  DoE  samples  that  can  be  used  to  fit  the  MBKG 
surrogate  model  as  they  will  add  the  most  information  to  the 
surrogate  model.  These  additional  DoE  samples  can  then  be 
used  together  with  the  initial  DoE  samples  to  refit  the  MBKG 
surrogate  model.  This  process  can  be  done  iteratively  to 
sequentially  select  DoE  samples  to  be  used  to  fit  the  MBKG 
surrogate  model. 

It  is  also  desirable  to  select  the  sequentially  added  DoE 
samples  such  that  they  are  far  from  existing  DoE  samples  so  that 
they  will  contribute  the  most  information  to  the  MBKG  surrogate 
model.  When  using  a  Kriging  method  it  is  desirable  for  DoE 
samples  to  be  far  from  existing  samples  to  avoid  computational 
problems  with  evaluating  the  correlation  matrix.  As  the  distance 
between  DoE  samples  approaches  zero,  the  estimated  correlation 
approaches  one.  This  causes  the  correlation  matrix  to  become 
ill-conditioned  and  leads  to  the  correlation  matrix  becoming 
singular.  In  this  paper  it  is  assumed  that  the  true  underlying 
surface  without  noise  is  smooth.  Thus,  DoE  samples  that  are 
extremely  close  to  each  are  likely  to  have  response  values  that 
are  close  to  each  other.  These  DoE  samples  and  response  values 
will  provide  little  information  to  improve  the  surrogate  model. 
Therefore,  it  is  desirable  for  sequentially  added  DoE  samples  to 
be  far  from  existing  DoE  samples  to  avoid  numerical  difficulties 
and  also  to  add  the  most  information  to  farther  improve  the 
surrogate  model. 

To  determine  the  locations  for  the  sequentially  added  DoE 
samples,  a  weighting  system  is  used.  The  first  step  is  to  calculate 
the  distance  between  the  test  points  and  the  existing  DoE  samples 
previously  used  to  fit  the  MBKG  surrogate  model.  This  gives  a 
matrix  of  distance  values  and  can  be  expressed  mathematically 
as 

Dij  =  Is7’  - t?  |  for  1  =  -  >  ntest  and  J  =  -  >  nDoE  (23) 

where  Z>  is  the  distance  between  the  Ith  test  point  and  jlh 
existing  DoE  sample,  sJ  is  the  jth  existing  DoE  sample,  tJ  is 
the  ith  test  point,  ntest  is  the  number  of  test  points  whose  95% 
credible  set  capture  zero,  and  nDoE  is  the  number  of  existing  DoE 

samples.  Once  these  distances  are  obtained,  the  next  step  is  to 
find  the  minimum  distance  from  all  the  existing  DoE  samples  for 
each  test  point.  This  then  gives  a  vector  of  the  distance  between 
the  test  point  and  the  closest  existing  DoE  sample,  which  is 
expressed  mathematically  as 

d,  =  min  [Dt]  }  for  j  =  1, ... ,  nDoE  (24) 


where  is  the  distance  between  the  ith  test  point  and  the  closest 
existing  DoE  sample,  D is  the  distance  between  the  ith  test 
point  and  the  jth  existing  DoE  sample  as  calculated  in  Eq.  (23), 
and  nDoE  is  the  number  of  existing  DoE  samples. 

The  last  value  needed  to  calculate  the  weight  is  the  width  of 
the  95%  credible  sets  that  capture  zero.  This  width  is  simply 
calculated  as 

c7  =  |cf  I  +  cf  for  1  =  1,...,  nlesl  (25) 

where  cf  is  the  width  of  the  95%  credible  set  for  the  ith  test 
point,  cf  is  the  lower  bound  of  the  95%  credible  set  for  the  ith 
test  point,  cf  is  the  upper  bound  of  the  95%  credible  set  for  the 
Ith  test  point,  and  ntest  is  the  number  of  test  points.  Note  that 
only  the  95%  credible  sets  that  contain  zero  are  used.  Therefore 
the  lower  bound,  cf ,  is  always  less  than  zero  and  upper  bound, 
cf  ,  is  always  greater  than  zero.  The  weight  for  each  test  point 
is  then  calculated  as 

W/  =  di*c7  for  /  =  1, ... ,  ntest  (26) 

where  w.  is  the  weight  for  the  ith  test  point,  is  the  distance 
between  the  ith  test  point  and  the  closest  existing  DoE  sample  as 
calculated  in  Eq.  (24),  cf  is  the  width  of  the  95%  credible  set 
for  the  ith  test  point  as  calculated  in  Eq.  (25),  and  ntest  is  the 
number  of  test  points.  The  test  point  that  has  the  largest  weight 
given  by  Eq.  (26)  is  selected  as  the  new  DoE  sample,  expressed 
as 

snew  =  max  { w. }  for  /  =  1, ... ,  ntest  (27) 

where  snew  is  the  new  DoE  sample  selected  from  the  ntest  test 
points,  and  vv.  is  the  weight  for  the  ith  test  point.  Then  snew  is 
added  to  the  list  of  existing  DoE  samples,  and  this  selection 
process  can  be  iterated  until  the  desired  number  of  new  DoE 
samples  are  added. 

6.  EXAMPLES 

6. 1  Nonlinear  2-D  Mathematical  Example 

A  2-D  mathematical  example  with  added  random  noise  is 
used  to  demonstrate  the  developed  MBKG  surrogate  modeling 
and  sequential  sampling  method.  The  function  considered  is 
defined  as 

G2(X)  =  -1  +  (0.9063^  +  0.4226X,  - 6)2 
+(0.9063X,  +0.4226X,  -6)3 

4  (28) 
-0.6(0.9063^  +  0.4226A,  -  6)4 

-(-0.4226^+0.9063^) 
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where  Xl  and  X2  are  the  two  input  variables.  The  random  noise 
added  to  Eq.  (28)  is  generated  using  a  normal  distribution  with 
zero  mean  and  a  standard  deviation  of  cr£  as 

e~N(0,a2£)  (29) 

The  response  values  with  noise  are  then  calculated  as  a 
summation  of  the  true  response  value  without  noise  as  given  by 
Eq.  (28)  and  a  realization  of  s  from  the  distribution  in  Eq.  (29). 
This  is  expressed  as 

yt  ~  G2  (xf ) + (30) 

where  y.  is  the  ith  response  value  with  noise  at  x. ,  which  is  the 
ith  DoE  sample  point,  and  £i  is  the  ith  realization  of  noise. 

The  reliability  analysis  of  Eq.  (30)  is  to  be  carried  out  using 
the  two  input  variables  Xl  and  X2 .  To  carry  out  the  reliability 
analysis  the  two  input  variables  are  assumed  to  each  follow  a 
normal  distribution  as 

Xx  ~7V(5.19,0.32) 

(31) 

X2  ~  Af(0.74,0.32) 

and  are  correlated  to  each  other  by  Clayton  copula  with  a 
correlation  coefficient  of  0.5.  Because  of  this  correlation  the 
domain  of  interest  for  this  example  forms  the  raindrop  shape  seen 
in  Fig.  2.  This  domain  of  interest  is  referred  to  as  the  local 
window  [48]. 

Thus,  when  fitting  the  MBKG  surrogate  model  a  local 
window  based  on  the  distributions  in  Eq.  (31)  and  the  Clayton 
copula  correlation  was  used.  It  should  be  noted  that  the 
correlation  of  Xx  and  X2  is  used  only  to  construct  the  local 
window,  i.e.,  the  domain  of  the  MBKG  surrogate  model.  The 
correlation  is  not  used  and  does  not  affect  the  fitting  of  the 
MBKG  surrogate  model.  By  using  a  local  window,  the  DoE 
samples  used  to  fit  the  MBKG  surrogate  model  only  need  to  be 
within  the  domain  of  the  local  window  and  not  the  entire  domain 
of  Xx  and  X2  for  Eq.  (28). 

To  start  the  fitting  process  of  the  MBKG  surrogate  model, 
1 000  uniform  randomly  generated  test  points  were  generated  in 
the  local  window  as  shown  in  Fig.  2.  The  black  filled  circle  in 
Fig.  2  is  the  location  of  the  mean  value  for  X}  and  X2  as  given 
in  Eq.  (31). 

The  DoE  samples  used  to  fit  the  MBKG  surrogate 
model  are  selected  from  among  the  test  points  shown  in 
Fig.  2.  The  initial  25  DoE  samples  used  to  initially  fit  the 
MBKG  surrogate  model  were  selected  such  that  they 
uniformly  covered  the  domain  and  are  shown  in  Fig.  3  as 
the  blue  circles.  The  contour  of  the  true  limit  state  of  Eq. 


(28)  is  also  shown  in  Fig.  3  as  the  red  curve.  Using  the 
DoE  samples  and  Eq.  (30),  response  values  are  generated 
and  used  to  fit  the  MBKG  surrogate  model. 

For  the  MBKG  surrogate  model,  a  second  order  mean 
structure  and  the  spline  correlation  function  were  used. 
The  values  used  for  the  prior  parameters  are  given  in  Table 
2.  Markov  chain  Monte  Carlo  (MCMC)  was  carried  out 
using  the  25  initial  DoE  and  response  values  given  by  Eq. 
(30),  where  <j£  =  0.01  was  used  with  Eq.  (29)  to  generate 
the  noise  in  the  response.  Three  parallel  chains,  each  of 
150,000  iterations  were  generated  using  MCMC.  The  first 
100,000  iterations  were  discarded  as  burn-in. 


Figure  2.  A  1000  Uniform  Randomly  Generated  Test 
Points 


xi 

Figure  3.  Initial  25  DoE  and  Limit  State  of  Eq.  (28) 
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Table  2.  Prior  Parameter  Values  for  Fitting  Eq.  (30) 


Unknown 

Parameter 

Prior 

Distribution 

Parameter  1 

Parameter  2 

p 

M.V.  Normal 

Vp  =0 

= 6501 

cr2 

Inverse- 

Gamma 

aa  =  2.0000 

Pa  =  2-500 

X 

Inverse- 

Gamma 

ax  =  1.9548 

A  =0.2747 

4 

Uniform 

a9t  =0.1500 

b8i  =5.0000 

To  generate  the  contour  plot  of  the  limit  state,  response 
values  were  predicted  using  the  fitted  MBKG  surrogate 
model.  The  mean  values  of  the  response  values  were  then 
used  to  generate  the  contour  plot  of  limit  state  given  by  the 
MBKG  surrogate  model.  The  predicted  limit  state  is  shown  in 
Fig.  4  as  the  green  curve.  It  is  easily  seen  that  the  initial 
prediction  of  the  limit  state  does  not  match  the  true  limit  state 
(red  curve  in  Fig.  4)  very  well.  To  improve  the  prediction  of  the 
limit  state  the  sequential  sampling  method  developed  in  Section 
5  will  be  used  to  select  20  more  DoE  samples.  The  blue  circles 
in  Fig.  4  are  the  initial  25  DoE  samples  used.  The  yellow  filled 
circles  in  Fig.  4  are  the  test  points  whose  95%  credible  set  of  the 
predicted  response  value  capture  zero.  As  seen  in  Fig.  4  the 
yellow  filled  circles  cover  a  large  portion  of  the  local  window. 
This  indicates  that  there  is  a  large  amount  of  uncertainty  in  the 
predicted  response  values.  Using  the  sequential  sampling 
method  developed  in  Section  5,  20  more  DoE  samples  are 
selected.  The  20  additional  DoE  samples  are  shown  in  Fig.  4  as 
the  black  squares.  It  is  seen  how  the  20  additional  DoE  samples 
are  selected  such  that  they  are  roughly  uniformly  spaced  from 
the  existing  25  initial  DoE  samples.  Using  the  45  DoE  samples 
the  MBKG  surrogate  model  is  refitted.  The  contour  of  the  limit 
state  given  by  MBKG  surrogate  model  using  the  45  DoE  samples 
is  shown  as  the  blue  curve  in  Fig.  4.  The  test  points  are  predicted 
again  using  the  updated  MBKG  surrogate  model.  The  test  points 
whose  95%  credible  set  that  capture  zero  are  shown  as  the  black 
dots  in  Fig.  5.  In  Fig.  5  it  can  be  seen  how  the  95%  credible  set 
capture  the  limit  state  shrank,  i.e.,  the  uncertainty  about  the 
location  of  the  limit  state  is  smaller. 


4  4.5  5  5.5  6 


X1 

Figure  4.  Contours  of  MBKG  Limit  State  and  Test  Points 
with  95%  Credible  Set  Capturing  Limit  State  Using  25  DoE 


xi 

Figure  5.  Test  Points  with  95%  Credible  Set  Capturing 
Limit  State  Using  45  DoE 


The  sequential  sampling  process  is  repeated  three  more 
times,  adding  20  DoE  samples  each  time  until  a  total  of  105  DoE 
samples  are  used  to  fit  the  MBKG  surrogate  model.  Figure  6 
shows  the  true  limit  state  of  Eq.  (28)  as  the  red  curve,  the 
predicted  limit  state  given  by  the  MBKG  surrogate  model  as  the 
blue  curve,  the  initial  25  DoE  as  the  blue  circles,  the  first  set  of 
20  sequential  DoE  as  the  black  squares,  the  second  set  of  20 
sequential  DoE  as  the  pink  circles,  the  third  set  of  20  sequential 
DoE  as  the  blue  X’s,  and  the  fourth  set  of  20  sequential  DoE  as 
the  blue  dots. 
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Figure  6.  MBKG  Limit  State  and  105  DoE  Samples 


It  was  demonstrated  visually  how  the  prediction  of  the  limit 
state  is  improved  using  the  sequential  sampling  method.  To 
numerically  study  how  the  limit  state  is  improving,  a  reliability 
analysis  using  the  MBKG  surrogate  model  is  carried  out.  The 
probability  of  failure  is  predicted  for  each  MCMC  iteration.  This 
gives  the  distribution  of  the  probability  of  failure.  This 
distribution  characterizes  the  amount  of  uncertainty  in  the 
prediction  of  the  probability  of  failure.  Figure  7  shows  the 
posterior  distribution  of  the  probability  of  failure  when  only  the 
initial  25  DoE  samples  are  used  to  fit  the  MBKG  surrogate 
model.  Figure  8  shows  the  posterior  distribution  of  the 
probability  of  failure  when  65  DoE  samples  are  used  to  fit  the 
MBKG  surrogate  model.  The  additional  40  DoE  samples  were 
added  20  at  a  time  using  the  sequential  sampling  method.  Figure 
9  shows  the  posterior  distribution  of  the  probability  of  failure 
when  105  DoE  samples  are  used  to  fit  the  MBKG  surrogate 
model.  Studying  Figs.  7-9  it  can  be  seen  that  the  posterior 
distribution  of  the  probability  of  failure  is  becoming  narrower 
and  narrower,  indicating  that  amount  of  uncertainty  is 
decreasing. 

Table  3  shows  the  probability  of  failure  statistics  using  the 
different  number  of  DOE  samples.  As  seen  in  Table  3  the  mean 
value  of  the  probability  of  failure  is  converging  to  the  true 
probability  of  failure.  Also  seen  in  Table  3  is  how  the  standard 
deviation  of  the  probability  of  failure  decreases  as  more  DoE 
samples  are  used.  The  95%  credible  set  of  the  probability  of 
failure  shrinks  as  more  DoE  samples  are  added.  The  decrease  in 
the  standard  deviation  and  the  shrinkage  of  the  95%  credible  set 
indicates  reflects  how  there  is  less  uncertainty  in  the  prediction 
of  the  probability  of  failure  as  more  DoE  samples  are  added. 


Probability  of  Failure 

Figure  7.  Posterior  Distribution  of  Probability  of  Failure 
Using  25  DoE 


Probability  of  Failure 

Figure  8.  Posterior  Distribution  of  Probability  of  Failure 
Using  65  DoE 
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Figure  9.  Posterior  Distribution  of  Probability  of  Failure 
Using  105  DoE 
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The  above  study  was  done  using  <j£  —  0.01  with  Eq.  (29)  to 
generate  the  noise  in  the  response.  The  MBKG  surrogate 
modeling  and  sequential  sampling  method  was  also  studied  for 
two  higher  noise  levels  of  <j£  =  0.05  and  <j£-  0.1.  The 
probability  of  failure  statistics  for  noise  levels  when  cr£  =  0.05 
and  o£  =0.1  are  shown  in  Tables  4  and  5,  respectively.  From 
Tables  3-5  it  can  be  seen  that  the  mean  value  of  the  probability 


of  failure  is  converging  to  the  true  probability  of  failure  as  the 
number  of  DoE  increases.  It  can  also  be  seen  that  the  credible 
sets  for  the  larger  noise  are  slightly  wider  than  for  the  smaller 
noise.  This  indicates  that  with  the  larger  noise  there  is  more 
uncertainty  in  the  prediction  of  the  probability  of  failure.  It  can 
also  be  seen  that  the  95%  credible  sets  for  all  noise  levels  does 
in  fact  capture  the  true  probability  of  failure.  Thus,  the  method 
is  robust  for  different  levels  of  noise  in  the  response. 


Table  3.  Probability  of  Failure  Statistics  for  <j£  =0.01 


#  DoE 
Samples 

Mean 

Std. 

95%  Credible  Set 

25 

42.12% 

0.1809 

8.10% 

72.75% 

45 

45.55% 

0.0875 

25.29% 

59.82% 

65 

47.36% 

0.0473 

37.32% 

55.79% 

85 

48.15% 

0.0253 

43.05% 

53.00% 

105 

48.26% 

0.0180 

44.57% 

51.68% 

True 

48.47% 

N/A 

N/A 

N/A 

Table  4.  Probability  of  Failure  Statistics  for  <j£  =  0.05 


#  DoE 
Samples 

Mean 

Std. 

95%  Credible  Set 

25 

42.48% 

0.1800 

8.15% 

72.71% 

45 

46.13% 

0.0892 

25.17% 

60.44% 

65 

47.52% 

0.0471 

37.34% 

55.75% 

85 

48.62% 

0.0244 

43.72% 

53.33% 

105 

48.30% 

0.0203 

44.10% 

52.12% 

True 

48.47% 

N/A 

N/A 

N/A 

Table  5.  Probability  of  Failure  Statistics  for  cr£  =  0.1 


#  DoE 
Samples 

Mean 

Std. 

95%  Credible  Set 

25 

42.94% 

0.1786 

8.42% 

72.61% 

45 

47.64% 

0.0865 

27.05% 

61.34% 

65 

48.79% 

0.0488 

38.26% 

57.33% 

85 

47.75% 

0.0294 

41.51% 

53.14% 

105 

47.98% 

0.0233 

43.00% 

52.19% 

True 

48.47% 

N/A 

N/A 

N/A 

6.2  A  3-D  Multibody  Dynamics  Engineering  Example 

This  section  will  present  a  3-D  multibody  dynamics  (MBD) 
block-car  example  that  uses  the  developed  methods  to  estimate 


the  probability  of  failure  for  the  current  design.  The  problem  is 
a  simple  example  to  demonstrate  using  the  method  for  an  actual 
engineering  problem  that  contains  noise  in  the  response.  Figure 
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10  shows  the  multibody  dynamics  block-car  used  for  this 
example.  The  car  is  modeled  as  a  simple  block  with  four  wheels 
as  shown  in  Fig.  10.  The  origin  of  the  local  coordinate  system 
for  the  car  body  is  located  at  the  center  of  the  car  body  as  shown 
in  Fig.  10.  Reliability  analysis  of  this  example  was  attempted 
previously  using  standard  Kriging  methods  as  the  surrogate 
modeling  method.  However,  the  standard  Kriging  methods 
failed  when  trying  to  create  the  surrogate  model  for  the  contact 
force  due  to  the  noise  in  the  contact  force  response.  The  standard 
Kriging  method  failed  in  that  the  predicted  response  surface  was 
not  a  smooth  surface.  The  predicted  response  surface  looked  like 
white  noise,  i.e.,  it  was  not  smooth  and  was  jagged.  This 
example  was  the  motivation  that  led  to  the  research  carried  out 
in  this  work  and  the  development  of  the  methods  for  handling 
noisy  response  problems. 


Figure  10.  Multibody  Dynamics  Block-Car  Example 


The  three  design  variables  are  the  mass  of  the  car  and  the 
locations  of  the  center  of  mass  in  the  x  and  z  directions.  The 
two  performance  measures  of  interest  for  this  example  are  the 
contact  forces  between  the  front  wheels  and  the  ground.  The 
contact  force  should  not  be  less  than  125  pounds,  i.e.,  failure  is 
defined  as  the  contact  force  being  less  than  125  pounds.  These 
constraints  are  imposed  so  that  the  car  does  not  flip  over 
backwards  when  going  up  the  incline.  Contact  forces  calculated 
by  MBD  simulation  software  are  known  to  be  inherently  noisy 
responses.  For  this  example  the  commercially  available  MBD 
software  package  used  was  RecurDyn  [49].  The  values  of  the 
three  variables  for  the  current  design  are  defined  in  Table  6.  The 
negative  numbers  in  Table  6  mean  that  the  center  of  mass  is 
located  to  the  back  and  the  bottom  of  the  car,  i.e.,  the  negative  x 
and  z  direction. 


using  MCMC.  The  first  150,000  iterations  were  discarded  as 
burn-in.  The  remaining  50,000  iterations  were  pooled  to  give  a 
total  of  150,000  samples  from  the  posterior  distribution  to  use 
for  inference.  The  fitting  of  the  MBKG  surrogate  model  started 
with  25  DoE  samples  with  additional  DoE  samples  added  using 
the  developed  sequential  sampling  method. 

Tables  7  and  8  show  the  probability  of  failure  statistics  as 
more  DoE  samples  are  added  to  the  MBKG  surrogate  model.  As 
seen  in  both  tables  the  standard  deviations  are  large  and  95% 
credible  sets  are  wide  when  using  only  25  DoE  samples.  This 
indicates  that  there  is  a  large  amount  of  uncertainty  in  the 
estimation  of  the  probability  of  failure.  The  standard  deviation 
decreases  as  the  number  of  the  DoE  increases.  The  95%  credible 
sets  shrink  noticeably  up  until  about  105  DoE  are  added.  From 
105  DoE  samples  until  165  DoE  samples,  the  estimates  of  the 
mean,  standard  deviation,  and  95%  credible  set  appear  to  be 
converging.  Thus,  the  developed  methods  look  to  be  performing 
well  for  this  example. 


Table  6.  Current  Design  of  MBD  Block-Car 


Design 

Variable 

Current  Design 

dx  =  Mass 

65.00 

d2  =  CM  x 

-33.92 

d3  =  CM  z 

-43.78 

For  the  MBKG  surrogate  model  a  second  order  mean 
structure  and  spline  correlation  function  were  used.  Three 
parallel  chains,  each  with  200,000  iterations,  were  generated 
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Table  7.  Probability  of  Failure  Statistics  for  Left  Front  Wheel 


#  DoE 
Samples 

Mean 

Std. 

95%  Credible  Set 

25 

38.62% 

0.1310 

16.35% 

66.65% 

45 

42.94% 

0.0756 

28.10% 

58.16% 

65 

47.76% 

0.0500 

38.11% 

58.03% 

85 

44.15% 

0.0367 

36.72% 

51.44% 

105 

43.19% 

0.0320 

37.06% 

49.62% 

125 

42.65% 

0.0262 

37.58% 

47.92% 

145 

43.16% 

0.0236 

38.67% 

47.99% 

165 

41.22% 

0.0223 

37.11% 

45.82% 

Table  8.  Probability  of  Failure  Statistics  for  Right  Front  Wheel 


#  DoE 
Samples 

Mean 

Std. 

95%  Credible  Set 

25 

48.30% 

0.1359 

23.05% 

75.32% 

45 

55.63% 

0.0815 

39.43% 

71.25% 

65 

64.38% 

0.0492 

54.25% 

73.64% 

85 

65.75% 

0.0352 

58.55% 

72.51% 

105 

63.79% 

0.0290 

57.91% 

69.40% 

125 

63.23% 

0.0225 

58.56% 

67.54% 

145 

61.69% 

0.0202 

57.74% 

65.80% 

165 

59.43% 

0.0210 

55.28% 

63.53% 

7.  CONCLUSION 

The  use  of  surrogate  models  for  carrying  out  reliability 
analysis  is  becoming  popular.  However,  interpolation-based 
surrogate  modeling  methods  break  down  when  the  response 
values  contain  noise.  On  the  other  hand,  for  regression-based 
surrogate  modeling  methods  that  can  handle  responses  with 
noise,  the  uncertainty  in  the  surrogate  model  and  predicted 
response  values  is  unknown.  A  modified  Bayesian  Kriging 
(MBKG)  surrogate  modeling  method  for  problems  in  which 
simulation  analyses  are  inherently  noisy  was  successfully 
developed.  The  MBKG  surrogate  modeling  method  was 
developed  to  be  used  to  carryout  reliability  analysis  to  predict  the 
probability  of  failure.  The  posterior  distributions  generated  by 
the  MBKG  surrogate  model  characterize  the  uncertainty  in  the 
estimated  parameters.  A  sequential  sampling  method  that  uses 
the  posterior  credible  sets  of  the  MBKG  surrogate  model  was 
successfully  developed. 

As  with  any  method,  the  developed  method  does  have  some 
limitations.  Like  all  surrogate  modeling  methods,  as  the  number 
of  dimensions  grows,  the  difficulty  and  the  number  of  DoE 
required  to  build  an  accurate  surrogate  model  grows.  Thus,  for 
a  high  dimensional  problem  the  MBKG  surrogate  modeling 
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method  may  not  be  well  suited.  A  more  detailed  study  on  how 
many  dimensions  the  MBKG  surrogate  modeling  method  can 
handle  is  left  as  future  research  to  be  carried  out.  The 
disadvantage  of  using  Markov  chain  Monte  Carlo  (MCMC)  to 
fit  the  MBKG  surrogate  model  is  the  long  computational  time 
required.  Each  of  the  MCMC  iterations  of  a  chain  has  to  be 
drawn  serially,  so  if  a  large  number  of  iterations  is  required  to 
converge  to  the  posterior  distribution,  the  computational  time 
will  be  long.  However,  multiple  MCMC  chains  can  be  run  in 
parallel  to  reduce  the  computational  time.  Another  difficulty 
with  the  MBKG  surrogate  modeling  method  is  determining 
priors  to  be  used  for  parameters  of  the  MBKG  surrogate  model. 
For  the  examples  presented  in  this  paper  relatively 
noninformative  priors  were  used.  However,  it  might  be  possible 
to  come  up  with  more  informative  priors  which  would  reduce  the 
total  number  of  DoE  samples  required  to  fit  the  MBKG  surrogate 
model.  Studying  the  effects  of  the  priors  on  fitting  the  MBKG 
surrogate  model  is  left  as  future  research. 

The  new  methods  were  demonstrated  with  a  2-D 
mathematical  example  and  a  3-D  MBD  engineering  example. 
The  examples  showed  how  the  posterior  distribution  of  the 
probability  of  failure  becomes  narrower  as  more  DoE  samples 
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are  used.  It  was  also  shown  that  the  mean  value  of  the  estimated 
probability  of  failure  converged  to  the  true  probability  of  failure 
as  more  DoE  samples  were  sequentially  added  to  the  MBKG 
surrogate  model.  Thus,  the  two  examples  successfully 
demonstrated  the  developed  methods. 
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